U.S. DEPARTMENT OF COMMERCE
Office of Inspector General

PUBLIC RELEASE

BUREAU OF THE CENSUS

Improvements Needed in Multiple Response
Resolution to Ensure Accurate, Timely
Processing for the 2000 Decennial Census

Inspection Report No. OSE-10711 / September 1999

Office of Systems Evaluation

U.S. Department of Commerce          Report OSE-10711
Office of Inspector General          September 1999

Table of Contents

EXECUTIVE SUMMARY ..... i
INTRODUCTION ..... 1
PURPOSE AND SCOPE ..... 2
BACKGROUND ..... 4
OBSERVATIONS ..... 8
I.    Dress Rehearsal Indicates That the Bureau Can Improve MRR Accuracy ..... 8
      A. Dress Rehearsal Evaluation Raises Accuracy Issues ..... 8
      B. MRR Must Address Questionnaire Data Quality Problems ..... 11
      C. MRR Rules and Software Warrant Review ..... 13
II.   A More Structured Development Approach Would Improve MRR Software ..... 15
      A. Bureau Software Engineering Standards Are Available but Not Widely Used ..... 16
      B. The Bureau Should Define MRR Requirements More Explicitly ..... 16
      C. MRR Software Testing Needs to Be More Complete ..... 19
      D. Beginning-to-End Testing Is Needed to Ensure Consistent Results ..... 20
III.  Improvements Are Needed to Make MRR Processing Timely for the 2000 Census ..... 21
CONCLUSION ..... 23
RECOMMENDATIONS ..... 24

EXECUTIVE SUMMARY

To make it easier for people to be counted in the 2000 Decennial Census, the Census Bureau will
provide several ways for households to respond. In addition to collecting data through mailed
questionnaires and enumerator operations, the bureau will enable the public to initiate responses
on forms available in public places, by telephone, or the Internet. While multiple methods of
responding increase the opportunities for people to be counted, they also increase the likelihood
of receiving multiple responses for some housing units, with some responses including the same
people and others reporting people for the first time.
The bureau has devised a set of rules to
count individuals at each address and has automated this process, called multiple response
resolution (MRR).

We conducted this evaluation to determine whether MRR is likely to resolve multiple responses
correctly; that is, when multiple responses are received, MRR will include people at an address
who resided there on Census Day (April 1, 2000) and omit people who did not reside there on
Census Day or whom MRR already counted via another response. We also sought to determine
whether MRR could accomplish its work in the allotted time during the decennial census.
Beginning in February 1998, we observed the bureau’s preparations for using MRR for the dress
rehearsal and reviewed results from its processing of dress rehearsal response data.

According to the Census Bureau, advances in computer technology provide the flexibility to offer
multiple response options without incurring undue risk to the accuracy of the resulting census
data. The bureau’s Census 2000 Dress Rehearsal Report Card asserts error rates as low as 0.3
percent for MRR. However, this statistic reflects the impact of MRR errors on the total
enumeration. It does not reflect the accuracy of MRR in resolving multiple responses, which in
the bureau’s dress rehearsal evaluations showed significantly higher error rates (see page 8).

We found several conditions that diminished the accuracy of MRR in the dress rehearsal. Issues
associated with questionnaire design, the way in which the public completed the questionnaire,
and automated data capture sometimes caused inaccurate data to be sent to MRR processing,
which then could not produce accurate results. In addition, the MRR software sometimes
produced questionable results in selecting the persons who make up a particular household and in
determining whether duplicate sets of data represented the same person.
MRR analysts are
reviewing the dress rehearsal and evaluation data to determine what modifications are needed to
the rules and associated software for resolving multiple responses (see page 11).

The Census Bureau cannot ensure that MRR is implemented correctly because it did not use a
sufficiently structured approach in developing the MRR software. A more structured approach,
using software engineering standards, would help ensure correct implementation of MRR. In
particular, conformance to a software engineering standard for specifying requirements would
produce a more accurate, complete requirements specification that would benefit users and
developers alike during software development and testing. The bureau has established such a
standard, but it has not yet been widely adopted by decennial staff. In addition, software testing
needs to be expanded to address more diverse response data and assure that MRR can reliably
handle a wide array of potential input conditions. Finally, the bureau needs to undertake a
coordinated effort to perform beginning-to-end testing of questionnaire data processing to ensure
that the output from each step is accurate and can be input into the next step without error (see
pages 15 and 20).

According to the Census Bureau, MRR processing for the decennial census will need to be
completed within 30 days to ensure timely availability of data for follow-on processing
operations. However, dress rehearsal results indicate that unless improvements are made, MRR
processing for the decennial census will require approximately 87 days.
The bureau needs to
modify MRR software to reduce excessive processing time and to obtain additional headquarters
computing resources (see page 21).

As a result of these issues, important decisions concerning MRR requirements and design must
be made and significant development and testing work accomplished. The bureau has
recognized many of the problems that we have discussed and is analyzing the dress rehearsal data
and working on improvements to MRR. The bureau has also made substantial improvements to
the data capture system since the dress rehearsal. However, the bureau has not fully defined the
activities needed to refine MRR and complete its development. To ensure that MRR is correctly
implemented and well tested for the 2000 Decennial Census, the bureau should define these
activities and develop and implement a plan for their completion (see page 23).

Our complete recommendations begin on page 24.

In its response to our draft report, the bureau stated that, with one exception, it concurred with, or
had already acted upon, our recommendations. The exception regards specifying the software
requirements necessary to address data quality issues, such as how to handle incomplete,
inconsistent, or erroneous data (Recommendation No. 4d). The bureau requested clarification of
this recommendation, which we provide on pages 18-19. Although the bureau concurred with
our recommendation to perform beginning-to-end testing of the questionnaire data processing
stream (Recommendation No. 5f), we believe that the planned testing, while valuable and
needed, will not fully address the concerns we identify in this report. Our comments on pages
20-21 describe the type of testing that would be responsive to our recommendation. The bureau
also provided comments on several aspects of our observations. Based on these comments, we
amplified our discussion of questionnaire design and modified the title of Observation II.B
to avoid any confusion between establishing standards for MRR accuracy and writing software
requirements specifications according to software engineering standards. The bureau’s response
is included in its entirety as the appendix to this report.

INTRODUCTION

The Bureau of the Census conducted the 1998 Dress Rehearsal to test methods and systems that
are planned for the 2000 Decennial Census. Because of the importance of the dress rehearsal to
the success of the decennial census, we have been evaluating critical information technology
components used to conduct the dress rehearsal, including data capture, the personnel and
payroll processing systems, and headquarters processing. This report addresses multiple
response resolution (MRR), a component of headquarters processing. Our evaluation of data
capture was presented in a report entitled Data Capture System 2000 Requirements and Testing
Issues Caused Dress Rehearsal Problems.¹ We found that the data capture system experienced
serious problems in the dress rehearsal resulting from insufficient testing and from difficulties in
controlling and communicating requirements.

The availability of multiple methods of responding to the census has significantly increased the
potential for persons and households to submit multiple responses. In addition to
bureau-initiated enumeration, the bureau will enable the public to initiate responses on forms
available in public places or via telephone or the Internet.
While multiple methods of responding increase the
opportunities to ensure that people have been counted, they also require the capability to resolve
situations in which more than one response for a housing unit² is received.³ In the past, the
bureau has tested some operations requiring resolution of multiple responses and has developed
automated and clerical methods to process them. New operations such as the Be Counted
program⁴ create the prospect of receiving forms from unknown residences and obtaining
fragmented household responses, adding new complexity to data processing development tasks.

The Census Bureau developed MRR software to consolidate questionnaire data submitted for a
residence on multiple forms. The bureau used MRR software to process 1998 Dress Rehearsal
response data. MRR is one step in the multi-step process within headquarters computer
processing that culminates in the apportionment counts and other products of tabulating census
data. MRR processes all housing unit questionnaire data by comparing data on all forms
received for an address, identifying duplicate persons, and choosing members of each household.
Since this processing affects the counts, it is vital that MRR software operate correctly and
reliably.

¹ Data Capture System 2000 Requirements and Testing Issues Caused Dress Rehearsal Problems,
  OSE-10846, January 1999.
² A housing unit is a house, an apartment, a mobile home or trailer, a group of rooms, or a single
  room intended for occupation as separate living quarters to which there is direct access from
  outside the building or through a common hall.
³ Resolving more than one response for a housing unit means choosing the appropriate data from
  multiple census questionnaires identified with that housing unit to represent that housing unit
  in the census.
⁴ The Be Counted program targets areas that are traditionally undercounted by placing census
  questionnaires at accessible sites so people who believe they were not counted can pick them
  up, complete them, and mail them to the bureau.

PURPOSE AND SCOPE

In November 1997 we reported that the bureau was developing software for decennial
headquarters processing without a well-defined software development process based on software
engineering principles.⁵ To follow up on that work, we conducted this evaluation to study one
system in depth, MRR. Our inspection objectives were to: (1) evaluate MRR requirements
specifications to determine if they are appropriate, sufficiently defined, and clearly
communicated, (2) determine if the data and software design will result in efficient, reliable
processing, (3) determine if a comprehensive set of test cases that fully tests the requirements and
a definitive method that correctly evaluates the outcome have been articulated, and (4) assess
plans to capture lessons learned from the 1998 Dress Rehearsal and apply them to the 2000
Decennial Census.

During our evaluation, we reviewed requirements specifications, design documentation, software
code, test plans, test data, test results, and Census Bureau evaluation reports pertaining to the
dress rehearsal. We also reviewed relevant documentation and operational assessments from the
1995 and 1996 census tests, and available dress rehearsal data processing results.
In addition, we
met with representatives from the Census Bureau’s Decennial Systems and Contracts
Management Office, Decennial Statistical Studies Division, Statistical Research Division,
Systems Support Division, and the MRR team.

With respect to our objectives to evaluate the requirements specification and testing, we analyzed
two key documents⁶ regarding the two principal MRR elements, Within-Block Search (WBS)
and Primary Selection Algorithm (PSA). The remainder of this report refers to these documents
as the specification and the test plan, respectively:

• Revised Specifications for the Within-Block Search and Primary Selection Algorithm in
  the 1998 Dress Rehearsal, Bureau of the Census, May 27, 1998.⁷

• Census 2000 Dress Rehearsal Test and Acceptance Plan for the Within-Block Search
  (WBS) and Primary Selection Algorithm (PSA), Bureau of the Census, September 11,
  1998.

⁵ Headquarters Information Processing Systems for the 2000 Decennial Census Require
  Technical and Management Plans and Procedures, OSE-10034, November 1997.
⁶ These documents are Census Confidential, which means that they are available to personnel
  involved in development, testing, and required evaluations.
⁷ An algorithm is a set of steps for solving a particular problem.

To assist in our evaluation of the bureau’s software testing, we also reviewed input and output
data made available to us from MRR testing efforts.
We used the bureau’s documentation on
applicable data files to understand these data.

To determine whether the data and software design would result in efficient, reliable processing,
we reviewed design documentation, code, the structure of the decennial response file, and logs
from the September 2, 1998, production processing of dress rehearsal data.

To assess the ability of the software to determine matches of person data accurately, we reviewed
output of the Service-Based Enumeration (SBE)⁸ component of dress rehearsal processing, which
employs the same matching software as MRR.

To assess how the bureau plans to capture lessons learned from the 1998 Dress Rehearsal and
apply them to the 2000 Decennial Census, we reviewed the MRR evaluations:⁹

• Evaluation Memorandum F1c and F2b, The Within-Block Search and Primary Selection
  Algorithm Operational Evaluation, January 31, 1999,

• Census 2000 Dress Rehearsal Evaluation Results Memorandum Series #F2a, Within-Block
  Search Expansion Evaluation Draft, March 11, 1999, and

• Census 2000 Dress Rehearsal Evaluation Results Memorandum Series #F1b, Results of
  the Evaluation of the Primary Selection Algorithm, [undated].

We also evaluated the bureau’s efforts to modify the implementation of these elements in
response to lessons learned from the dress rehearsal.

The purpose of this report is to inform the bureau of difficulties that we observed in
implementing MRR. The team working to develop MRR has done a commendable job on a
complex problem under difficult circumstances. We urge bureau management to support the
team in refining and thoroughly testing MRR to ensure high quality results in the decennial
census.

⁸ The SBE operation is designed to enumerate people at facilities where they might receive
  services, such as shelters, soup kitchens, health-care facilities and other selected locations.
  This operation targets the types of services that primarily serve people who have no usual
  residence.
⁹ The bureau restricts distribution of these dress rehearsal evaluation reports regarding MRR.

This inspection has been conducted in accordance with the Inspector General Act of 1978, as
amended, and the Quality Standards for Inspections, March 1993, issued by the President’s
Council on Integrity and Efficiency.

BACKGROUND

Because the 2000 Decennial Census will provide several ways for households to respond, it is
highly probable that there will be multiple responses for some housing units. Therefore, the
complete set of census response data for housing units must be viewed as potentially including
(a) multiple responses for a housing unit having person data in common, (b) multiple responses
for a housing unit not having person data in common, and (c) person data in common among
responses of different housing units. The Census Bureau has implemented MRR computer
processing to select the residents of a housing unit that has submitted multiple responses and to
prevent data resulting from multiple responses for the same person from being erroneously
included in the census. Each housing unit has a unique identifier called a census housing unit ID,
which processing prior to MRR uses to associate responses with an address. Then, MRR uses
two elements to search for person matches among multiple returns.
WBS resolves duplicate
person data among returns at the same or different addresses within a defined area when those
persons have been reported on Be Counted Form Equivalents (BCFEs),¹⁰ while PSA resolves
multiple responses within the census housing unit ID.

Earlier Experiences with Multiple Responses for a Census Housing Unit ID

In the 1990 Decennial Census, multiple responses occurred for a census housing unit ID because
the mail and non-response follow-up (NRFU) enumerations overlapped. The bureau
consequently received both mail and enumerator returns for some households. Also, data capture
allowed corrections to the paper form, which was then recaptured. This recycling of the paper
form through data capture could occur repeatedly. To select the “best” questionnaire record for
an ID, the bureau designed and implemented the PSA program.¹¹ This PSA version selected the
data of one primary questionnaire (with possible supplemental questionnaires for large
households) with all persons intact to represent the census housing unit ID.

Additional opportunity for multiple responses arose in the 1995 Census Test, which was
conducted at three sites. In addition to mail returns, the public could respond by using widely
distributed “Be Counted” forms and by telephone. Data capture keying methods input persons’
names as well as population data (e.g., sex, age, and race). The bureau designed a more complex
PSA to resolve multiple responses for each ID by selecting unique persons and marking duplicate
person entries. An automated matching algorithm compared person data on pairs of responses
and scored resultant pairs of persons based on the weights of matching data it found. Multiple
responses requiring computer matching occurred for 2.64 percent of the total 178,680
households, or some 4,717 households. Under some conditions, clerks reviewed the multiple
responses for an ID to minimize matching errors. Clerical review resolved 17 percent of cases
submitted to computer matching.¹² A slightly modified 1995 version of PSA also processed data
collected during the 1996 Community Census. However, the bureau eliminated the clerical
review step because of insufficient time to conduct such a review.

¹⁰ BCFEs include the paper “Be Counted” questionnaire forms, Telephone Questionnaire
   Assistance (TQA) interviews initiated by the respondent, paper mail return questionnaires
   sent by TQA upon request of the respondent, and Individual Census Questionnaires (ICQ),
   Individual Census Reports (ICR), and Military Census Reports (MCR) if the ICQ, ICR, or
   MCR have been assigned a census housing unit ID.
¹¹ Description of the Decennial Census Algorithm for the Selection of the Primary and
   Supplemental Records from the 1990 FOSDIC Data Capture Files, Susan P. Love,
   November 27, 1990.

The MRR Team

A team of analysts has defined the operational concepts and system requirements for MRR for
the 1998 Dress Rehearsal and 2000 Decennial Census. Since the dress rehearsal and decennial
census employ operations for obtaining responses that are new or expanded from the mid-decade
tests, the likelihood of multiple responses for households and persons at census housing unit IDs
has increased. The MRR team is responsible for producing the specifications that define the
software requirements.
Permanent members of the MRR team include personnel from the
Decennial Statistical Studies Division and Decennial Systems and Contracts Management Office.
Members from the former office are statisticians who provide the information on the functions
that MRR must perform and review the input data and results for correctness. The latter office
contains the processing systems group, which is responsible for developing the software for
several processing components including MRR and running them operationally to produce
official results. The software developers have also worked on defining the functional and
software requirements. Since requirements specifications reflect decisions regarding statistical
methods, Decennial Statistical Studies Division management signature is required to indicate that
consensus has been reached before releasing the specifications to software developers.

The MRR team has had to contend with issues such as determining the geographic boundaries
within which to search for duplicate responses for a person reported on a BCFE, deciding the
criteria to use to define a household and to select the primary household, and specifying what
data are required to qualify a person entry on a questionnaire as having information sufficient for
matching. For the dress rehearsal, the team also had responsibility for developing the Invalid
Return Detection requirements. However, because detection of invalid (or fabricated) returns is
not a multiple response issue, it is not a responsibility of the MRR team for the Census 2000
program. The team also engineered and conducted system testing, which included developing
the test plan, preparing test decks to model various multiple response scenarios, conducting the
testing, analyzing the results, and correcting and re-testing the software as necessary.

¹² 1995 Census Test Results Memorandum No. 50, Theresa Leslie and Maureen Lynch, Bureau of
   the Census, Appendix 4, pp. 4-5.

Major Elements of MRR

MRR consists of a sequence of processing steps that is executed in batch (non-interactive) mode.
Figure 1 shows a simplified view of headquarters processing leading up to creation of the Census
Unedited File (CUF),¹³ with the scope of MRR processing outlined in blue. The elements of
MRR are described as follows:

• Decennial Response File - Stage 2 (DRF2)¹⁴ is both the input and output file for MRR
  and contains data fields whose values are compared to determine matching person
  records.

• WBS is a software program that performs a search and match operation for DRF2 person
  data records submitted on BCFEs when there is more than one response for a census
  housing unit ID. The search is conducted among census housing unit IDs within a
  defined area. If a match is found, WBS chooses one person entry and marks the other for
  deletion. WBS was developed and used operationally for the first time in the dress
  rehearsal.

• PSA, whose implementation is based on the software program developed for the 1995
  Census Test, determines which DRF2 person records for a given census housing unit ID
  are to be counted. Since it is possible for the same person to be included on different
  census forms for the same census housing unit ID, PSA performs a search and match
  operation between the person records on pairs of response records for an ID.
  The PSA
  uses the presence or absence of person matches along with a set of rules to create groups
  of person records that are considered households by the PSA process. Other rules are
  used to determine exactly which persons are to be included at that ID.

• Statistical Research Division (SRD) Matcher is a software program invoked by both
  the WBS and PSA elements to match pairs of person records. The matcher assigns
  weights based on probabilities to each characteristic compared, and adds them to arrive at
  a score. WBS and PSA compare the score to a cut-off weight to determine whether to
  treat the pair as a match. WBS and PSA invoke the same version of the matcher
  program; however, WBS requires more restricted conditions for matching persons from
  different census housing unit IDs than PSA uses for matching persons in the same ID.

¹³ The output from PSA retains the name of DRF2 and is used to create the CUF, which merges
   census housing unit ID information from the Decennial Master Address File (DMAF) with
   data associated with each census housing unit ID from the DRF2. The CUF also contains
   standard codes for fields that may contain write-in information such as race and relationship
   for easy tabulation.
¹⁴ The DRF is processed in two stages. Stage 1 creates DRF1 with the receipt of the first
   response records and is conducted on a continuing basis as response records are delivered to
   bureau headquarters from data capture operations. Selected DMAF variables are added to the
   DRF1 as it is sorted by block and census housing unit ID and before it is input to DRF2
   creation. DRF1 is input to stage 2 which includes MRR and runs once to create a final DRF2.

Figure 1. Scope of MRR Processing
[Flowchart, boxes in order: DRF1 and pre-MRR processing; DRF2; WBS (with SRD Matcher);
DRF2; PSA (with SRD Matcher); DRF2; create Census Unedited File (CUF).]

OBSERVATIONS

I.  Dress Rehearsal Indicates That the Bureau Can Improve MRR Accuracy

The bureau has issued its dress rehearsal report card,¹⁵ which assesses the performance of six
dress rehearsal operations. One of those operations is MRR, the focus of this evaluation.
Despite the high accuracy rating attributed to MRR by the report card, we found areas of
concern, including (1) the report card rating of MRR versus evaluation results, (2) accuracy of
the questionnaire data, and (3) unexpected results from MRR processing that warrant review.

A.  Dress Rehearsal Evaluation Raises Accuracy Issues

The bureau’s August 1997 Report to Congress: The Plan for Census 2000 states, “Advances in
computer technology in the areas of computer storage, retrieval, and matching, along with image
capture and recognition, have now given the Census Bureau the flexibility to provide multiple
response options without incurring undue risk to the accuracy of the resulting census data.”¹⁶
This statement prompts the question of how accurately multiple responses actually are resolved.
The Census 2000 Dress Rehearsal Evaluation Program has planned and implemented evaluations
on some aspects of MRR and reported the results. The bureau has issued the Census 2000 Dress
Rehearsal Report Card, which states the following accuracy levels for MRR-processed data:

• For the type of error of including persons that a follow-up interview determined were not
  residents, the percentage was 0.3 percent for both the Sacramento and Columbia, South
  Carolina, dress rehearsal sites.

• For the type of error of omitting persons that a follow-up interview determined were
  residents, the percentage was 0.4 percent for Sacramento and 0.3 percent for Columbia.

The bureau’s dress rehearsal evaluation plan had set accuracy standards for MRR based on the
percent of erroneous enumerations in the 1990 Census as measured by the Post Enumeration
Survey (PES). These standards were 4.6 percent for erroneously included persons (overcount)
and 1.3 percent for erroneously omitted persons (undercount).¹⁷ Since the reported error rate for
erroneously including persons for both sites was 0.3 percent, the report card claimed that this
error rate met the standard. Similarly, reported error rates for erroneously excluding persons also
met the standard.

¹⁵ Census 2000 Dress Rehearsal Report Card: Evaluation of the Standards for Success, Bureau of
   the Census, February 1999.
¹⁶ Report to Congress: The Plan for Census 2000, Bureau of the Census, August 1997, p. 13.

The report card obtained these error rates from the draft report for Evaluation F1b,
Evaluation of the Primary Selection Algorithm.¹⁸ This evaluation report puts these error rates in
perspective by stating that if the evaluation method had found that all persons selected by the
PSA were selected in error, then the selected-in-error rate would be 3.1 percent in Sacramento
and 2.3 percent in South Carolina. Similarly, if the evaluation method had found that all persons
excluded by the PSA were excluded in error, then the excluded-in-error rate would be 1.0 percent
in Sacramento and 0.8 percent in South Carolina.

So, even if the PSA element of MRR operated completely in error (excluded all residents and
included all non-residents found on multiple responses), the values that would then be used in the
report card would still meet the standards set in the evaluation study plan. The report card
concluded that MRR did not create a problem relative to the overall accuracy of dress rehearsal
enumerations.¹⁹

Bureau statisticians who define and develop MRR disagree with the use of PES percentages of
erroneous inclusions and exclusions as the standard to which MRR results should be held. The
percentage values are obtained by comparing response data from the census to data gathered
during the PES.
This comparison provides the numbers of matching and non-matching responses
from the two operations. Matching and non-matching values support the dual-system estimation
method[20] used to estimate the true population. Bureau statisticians working on MRR state that
calculating percentages of matches and non-matches and using them as the standard for MRR is
invalid because these discrepancies occur for many reasons in addition to incorrect decisions by
MRR. Recognizing the controversy within its own staff, the bureau later amended the standard
to read, “small relative to 4.6 percent” for the number of persons included in error and “small
relative to 1.3 percent” for the number of persons excluded in error. The bureau should develop
an appropriate standard for measuring the success of MRR processing.

        [17] Census 2000 Dress Rehearsal Evaluation Program Draft, Bureau of the Census, March 10, 1998,
revised May 15, 1998, Appendix E.
        [18] Census 2000 Dress Rehearsal Evaluation Results Memorandum Series #F1b, Results of the Primary
Selection Algorithm Draft, Bureau of the Census, [undated].
        [19] Report Card -- Evaluation of the Standards for Success, p. 8.
        [20] Tommy Wright, “Sampling and Census 2000, The Concepts,” American Scientist, May-June 1998,
p. 252.

The report card identified the impact of MRR errors on the total enumeration, but it did not
report the accuracy of MRR in resolving multiple responses.
The manner in which the error rates
in the report card were calculated resulted in substantial understatement of both the erroneous
inclusions and erroneous exclusions produced by MRR. For example, in Sacramento, the
number of persons belonging to households was about 300,000, which was the value used as the
denominator of the erroneous inclusion rate. However, PSA obtained data from multiple
responses for only approximately 10,000 persons of that population. When the error rate is
calculated with the denominator equal to this value, the result is 9 percent. This is 30 times the
error rate presented by the report card. Table 1 shows a summary of error rates for the two types
of errors that the bureau evaluated for the dress rehearsal sites.

                                              Table 1
                              Dress Rehearsal Multiple Response Rates

 Site          Type of Error        PES Standard   Report Card   Total Possible   Multiple Response
                                                                   MRR Error         Error Rate

 Sacramento    Included in error        4.6%           0.3%          3.1%               9%
               Excluded in error        1.3%           0.4%          1.0%              11%

 Columbia      Included in error        4.6%           0.3%          2.3%              14%
               Excluded in error        1.3%           0.3%          0.8%              11%

The bureau’s dress rehearsal evaluation provides data that the MRR team is analyzing to adjust
the algorithm for the decennial. This effort is important because the decennial will increase the
number and variety of conditions under which multiple responses occur.
Modifying MRR to
process multiple response data more accurately will aid in reducing the overcount and the
undercount that could potentially be caused by multiple responses.

•       The third column, PES Standard, shows the PES standard as previously discussed.

•       The fourth column, Report Card, shows the error rates presented in the report card.

•       The fifth column, Total Possible MRR Error, shows the percentage value of the ratio of
        person data originating on multiple responses to the site’s total population. These values
        indicate the percentage of error MRR would contribute to the total count if MRR resolved
        all multiple responses erroneously.

•       The sixth column, Multiple Response Error Rate, shows the error rates calculated by
        dividing the weighted number of persons in error by the total weighted number of persons
        whose data was obtained from multiple responses. We believe that the error rates shown
        in the sixth column give a more valid measure of how well MRR software made the
        decision to include or exclude persons in the household. These equal the error rates
        calculated by the Census Bureau’s evaluation team.

The report card’s error rates are so low that they suggest MRR worked nearly perfectly, which
is misleading. Inferences from these error rates would suggest that any efforts to
improve MRR are unnecessary. However, the evaluation data indicates weak areas where
improvement is needed. The MRR team is using the evaluation data to improve MRR
processing, and we encourage the bureau to support the team in this effort.
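
The gap between the two measures discussed in this section comes down to the choice of
denominator. The following is a minimal sketch of that arithmetic, using the rounded Sacramento
figures quoted in the text. The count of 900 persons included in error is back-calculated from the
0.3 percent report card rate and is an assumption, not a reported value; the function and variable
names are illustrative only.

```python
def error_rate(persons_in_error: float, denominator: float) -> float:
    """Return the error rate as a percentage of the chosen denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * persons_in_error / denominator


# Rounded figures quoted in the text for Sacramento.
household_population = 300_000         # persons belonging to households (approximate)
multiple_response_population = 10_000  # persons whose data came from multiple responses (approximate)

# Assumption: back-calculated as 0.3 percent of 300,000; not a figure reported by the bureau.
persons_included_in_error = 900

# Impact of MRR errors on the total enumeration (the report card's view).
impact_on_enumeration = error_rate(persons_included_in_error, household_population)

# Accuracy of MRR in resolving multiple responses (the evaluation team's view).
accuracy_of_resolution = error_rate(persons_included_in_error, multiple_response_population)

print(f"Share of total enumeration:         {impact_on_enumeration:.1f} percent")
print(f"Share of multiple-response persons: {accuracy_of_resolution:.1f} percent")
```

Reporting both figures side by side keeps the small effect on the total count from masking how
often multiple responses were resolved incorrectly.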
In reporting on MRR
performance in the future, the bureau should identify both the accuracy of MRR in resolving
multiple responses and the impact of MRR errors on the total enumeration.

B.      MRR Must Address Questionnaire Data Quality Problems

Data quality affects the accuracy of any processing. Obtaining accurate input data starts with the
respondent filling out the census form legibly and with complete and correct information
properly placed on the form. Next, the data capture operation submits these forms to the
automated system, DCS 2000, which converts write-in and check box data from the
questionnaires into standard computer format. If DCS 2000 inaccurately converts this data to
computer format, those errors propagate through all downstream processing including MRR.
Missing, incomplete, or erroneous data in certain questionnaire fields affects the ability of MRR
to match person data.

The DCS 2000 program has made significant improvements to the accuracy of data capture since
the dress rehearsal and is continuing to do so.[21] Errors will remain, however, because no method
of capturing data from census questionnaires will be error free. We also recognize that at this
late date, it is not advisable or feasible to make changes to the questionnaires. However, when
refining and improving MRR, it is important for the bureau to be aware of problematic
conditions that the software must handle. The following discussion is intended to raise the
visibility of these conditions.

The questionnaires’ lack of instructions or questions to obtain necessary information, phrasing of
instructions and questions, and layout of blank fields where respondents write their data have
confused some respondents.
The quality of some responses indicates the following questionnaire
issues:

•       The questionnaires do not explain that an automated system will read and interpret the
        responses and that for this automated process to work accurately, respondents must print
        well-formed characters in the spaces provided.

•       The forms do not include a warning that the automated process cannot capture responses
        in red ink.

•       The forms do not show how to mark an answer indicator box so it is machine-readable.
        Also, some questions require exactly one choice while others allow one or more. Some
        respondents appeared to be confused and supplied more than one response when exactly
        one was required.

•       The forms do not provide separate space for a generational suffix field (such as Jr., Sr.,
        III). The generational suffix helps distinguish between two person entries with identical
        first and last names, and lack of this information has led to suspected errors.

        [21] The most recent analysis conducted by the RIT Research Institute showed accuracy rates for short form
mail return write-in data of between 99.36 percent and 99.56 percent including blank fields and between 97.33
percent and 98.14 percent excluding blank fields. For check box data, the accuracy rate including all fields was
99.47 percent and excluding blank fields was 98.82 percent. See Joint R&D Project to Advance Technology for
Data Capture System, Management Summary for February 1999, RIT Research Corporation, Rochester, NY, March
15, 1999.

Some MRR test case results demonstrated that, because the respondent has no way to
indicate that more than one household cohabits the same residence, when two families at the
same census housing unit ID each submit their own non-BCFE questionnaire response, MRR
will choose one family over another when both should be recorded at that address.

If the respondent omits certain information, the data cannot be compared to other more complete
entries to detect matching responses for a census housing unit ID. As noted in the OIG report,
Columbia Dress Rehearsal Experience Suggests Changes to Improve Results of the 2000
Decennial Census,[22] there were numerous instances in SBE operations in which the same person
was counted multiple times by enumerators but the questionnaires provided incomplete data.
These duplicate responses were not detected because of insufficient information. Similarly,
incomplete person responses on housing unit records, particularly from enumerator operations,
have been observed by census personnel to impair correct resolution of multiple responses. The
bureau should stress during enumerator training how incomplete data impacts the accuracy of
census results.

Also, bureau analysts have expressed concern about the reliance of DCS 2000 on optical
character recognition (OCR) and optical mark recognition (OMR), and are exploring ways to
improve the keying of data that the system cannot accurately process. DCS 2000 was designed to
identify questionable data and send it for human review and correction through the process
known as key from image (KFI).[23] During dress rehearsal, numerous instances of converted data
were sent to KFI because of low confidence in OCR and OMR results. However, limitations in
the types of review and corrections permitted during dress rehearsal did not always enable
complete and efficient correction of erroneous data. In some cases, fuller portions of the
questionnaire needed to be viewed for the KFI operator to know what the correction should be.
Also, in dress rehearsal, the operators were instructed to key what they saw on the form without
using human judgement, even if what was written was clearly invalid. For example, a respondent
might write “May” in the month of birth field instead of the acceptable numeric value “05.”
According to instructions, the operator was not to convert the letter version to the number.
Discussions are underway on how best to develop “intelligent keying” guidelines to mitigate this
limitation for the decennial.

        [22] Columbia Dress Rehearsal Experience Suggests Changes to Improve Results of the 2000 Decennial
Census, Inspection Report No. ESD-10783-8-0001, September 1998.

The forms filled out by the public repeat the questions for each person. Not only does that
require more paper for each questionnaire, it slows the respondent down by inhibiting an efficient
way of answering the questions, and it complicates the data capture operation by requiring more
sheets and folds of sheets of paper to be handled. During dress rehearsal, there were mismatched
sheets among questionnaires as a result. The enumerator questionnaire for the short form has the
most efficient format where data for the whole household can be completed and viewed in rows
and columns across one open sheet.
At this late date, we are not advocating questionnaire
redesign, but in future censuses, developing questionnaires and instructions that better facilitate
accuracy should be a top priority.

In its response, the bureau maintains that the questionnaires’ design is based on extensive
cognitive research and is constructed to lead people to respond as accurately as possible.
However, processing problems did arise from the questionnaires’ design as discussed in this
section. Further, the bureau maintains that research has found that extensive instructions do not
increase the accuracy of the responses. An approach that does not require extensive instructions
might be to include a brief caveat at the beginning of the questionnaire that explains that the form
is machine-read. This would alert the respondent to confine the answers to spaces provided.
More balance in weighing the trade-offs between “user-friendly” and technically feasible could
improve the quality of the processed response data.

        [23] Data Capture System 2000 resorts to KFI and Key-From-Paper (KFP) under specific conditions that
indicate probable OCR and OMR errors.

C.      MRR Rules and Software Warrant Review

Several members of the MRR team are analyzing results from MRR processing of dress rehearsal
data because of questionable resolution of multiple responses. The two elements that they are
primarily analyzing are the order in which PSA applies its selection criteria and the correctness of
the matcher’s rating of pairs of person data to determine if they represent the same person.
This
section discusses the rationale motivating this analysis and subsequent modification of the
software to improve its accuracy in performing these functions.

During dress rehearsal, there were instances where the matcher results were not what the bureau
analysts expected and did not appear to be accurate. Questionable matches were identified
during processing of SBE data, which uses the same version of the matcher as MRR and the
same constraints as the outside-census housing unit ID portion of WBS. In addition, the matcher
has given persons sharing birthdays or close in age and having similar names scores that indicate
they are duplicates. However, upon close examination of such duplicate pairs, we questioned
whether such persons might be twins, cousins, or related in some other way. We also questioned
removal of persons who had some data in common, but other data that was considerably
different. Removing one of such matching pairs erroneously would make the final results less
accurate. Review of MRR testing revealed similarly questionable results. The “Results and
Analysis” section of the test plan acknowledges these instances and notes that a clerical
procedure might preclude such questionable matches.[24]

MRR uses the version of the matcher which the bureau modified for the 1995 test to
accommodate commonly made typographical errors. The bureau now utilizes DCS 2000, which
primarily uses OCR to capture handwritten responses. However, the bureau has not analyzed
OCR errors that the matcher could also accommodate. Because OCR introduces different errors
from keyed data, we are concerned that some of the decisions made by the matcher in its scoring
process may not be appropriate.
Since the input data is obtained through OCR, and through keying when
DCS 2000 has calculated a low confidence in the OCR data, the bureau needs to review matcher
performance against these conditions to ensure accurate matching.

Because of concerns about the matcher software’s performance in operations during the dress
rehearsal, the lead programmer of MRR has initiated an effort to assess the matcher’s
performance on dress rehearsal data. The assessment will use a subset of the dress rehearsal data
consisting of all households for which multiple returns were received. After establishing
guidance for each member to use to determine whether persons from different returns match, the
team is manually identifying all person matches across all returns for a household. This effort
will establish a truth set to serve as the basis for assessing matcher performance. The team plans
to use the matcher software to identify matches across these same returns and then determine
how closely the matcher results correspond to the truth set established manually. Current plans
are to adjust the parameter file used by the matcher to score the data until results agree closely
with the results of the manual matching.

        [24] Census 2000 Dress Rehearsal Test and Acceptance Plan for the Within-Block Search (WBS) and
Primary Selection Algorithm (PSA), Bureau of the Census, September 11, 1998, p. 81.

Bureau management from the Decennial Systems and Contracts Management Office has made
output data from DCS 2000 testing available to bureau personnel. Bureau management should
continue this sharing of data to the extent necessary to improve the matcher and DCS 2000 in
unison.
For example, bureau personnel have developed the capability of viewing the
questionnaire image in conjunction with the resulting computer standard format for each field.
Use of such a tool would facilitate the debugging of both MRR and DCS 2000. We further
discuss coordinated testing that checks the correctness of each step that processes the
questionnaire data in a later section on beginning-to-end testing.

In addition to issues associated with the matcher element of MRR, questions remain on how
effectively the PSA element applies criteria to choose members of a household. As noted above
in the discussion of the report card and evaluation results, evaluation data provides the analysts a
basis for determining how well the criteria performed and assessing whether the order in which
they are used by the algorithm is appropriate. We support this effort, which will result in a
revised set of software requirements that will aid in modifying the software to produce more
accurate results. In a later section on developing the software requirements
specification according to software engineering standards, we cite specific areas of ambiguity
that need to be addressed in a modified specification. The bureau’s efforts to analyze the data
and determine modifications to the algorithm that remove errors and ambiguities, together with a
more rigorous approach to defining the software requirements and implementing that algorithm,
should result in improved MRR processing.


II.    A More Structured Development Approach Would Improve MRR Software

Successful software development depends on defining requirements as early in the project as
feasible. To ensure that software correctly implements the intentions of the users, requirements
specifications must be complete, consistent, unambiguous, and verifiable.
Use of a software
engineering standard to guide subject matter experts through the process of defining requirements
helps this effort. Applying such a standard to the development of the MRR specification would
also be beneficial in determining how MRR will compensate for questionnaire design issues,
clarifying expectations for input data quality, assessing the impacts of changes in requirements,
and guiding the testing effort.

The bureau is fortunate to have a team of experienced and dedicated analysts and developers who
are implementing MRR for the decennial census. The team hired a contractor who has assisted
them in applying software engineering principles, which has had a very positive effect. For
instance, in spite of an already compressed schedule, the team submitted new software to peer
review and extended testing to a wider variety of cases. This section highlights the need for
additional efforts of this type.

A.     Bureau Software Engineering Standards Are Available but Not Widely Used

The bureau’s software standards branch in the Office of the Associate Director for Information
Technology has written a manual that describes how to write software specifications by accepted
software engineering principles. Entitled The Census Software Development Life Cycle,[25] this
manual unfortunately has not been used by the MRR team for its specifications. Members of the
team who wrote the manual are still involved in software development for the bureau and are
available to provide consulting support in applying the manual to the development process.
They
can help a development team tailor the manual to fit the needs and constraints of an organization.
For instance, we believe applying the third section of the manual, “Software Requirements
Definition and Analysis,” which provides guidance on how to state rigorously what the software
must accomplish, would substantially clear up MRR issues. Two other standards available
within the bureau are:

•      Programming Standards and Guidelines Manual, Bureau of the Census, March 1991, and

•      Decennial System and Contract Management Office Decennial Processing Systems
       Software Development Process, Bureau of the Census, December 11, 1997.

The programming standards manual presents standards for how to write software code. It
includes sample code and guidelines for conducting peer reviews, which can be invaluable to the
process. It also cites excellent references. The second manual was developed by the on-site
contractor and provides extensive and detailed instructions for many software engineering
methods to be applied throughout development. The MRR code and testing efforts show
evidence that developers have followed some methods advocated by these documents. However,
conformance to a standard occurred because of individual preference as opposed to a strategy
established by bureau management and supported by policy and training.

B.     The Bureau Should Define MRR Requirements More Explicitly

To determine the appropriateness and quality of the requirements specification, we reviewed the
key requirement of handling multiple responses associated with various form types and source
operations, e.g., mail return, update/leave, and Be Counted program. For any household, MRR
must handle multiple responses occurring in various combinations of form type and source
operation.
During the dress rehearsal, MRR had to handle a total of 20 distinct form types
(9 long form types and 11 short form types) associated with 11 operations. Table 2 shows which
forms were distributed by each operation, grouped by long and short form.

       [25] The Census Software Development Life Cycle, Bureau of the Census, November 7, 1994.

                                             Table 2
                Operations and Form Types Handled in Dress Rehearsal

 Dress Rehearsal Source Operations      Long Form Types      Short Form Types

 Coverage edit follow-up                DX-2                 DX-1
 Mail return (MR)                       DX-2                 DX-1   DX-1
 Update/leave                           DX-2(UL)             DX-1(UL)
 Update/leave ADD                       DX-2A(UL)            DX-1A(UL)
 MR Replacement                         DX-2                 DX-1
 NRFU (barcoded)                        DX-2(E)              DX-1(E)
                                        DX-2(E)(cf)          DX-1(E)SUPP
 NRFU (not barcoded)                    DX-2(E)              DX-1(E)
                                        DX-2(E)(cf)          DX-1(E)SUPP
 Be Counted program (paper)                                  DX-10
 Group quarters/special places          DX-15B               DX-15A
                                        DX-20B               DX-20A
                                        DX-21                DX-21
 TQA Be Counted                         TQA-10L              TQA-10S
 Large household follow-up orphans      DX-2(HF)             DX-1(HF)

 Key to highlighting in the original table:
   NOT INCLUDED IN TEST DATA (highlighted in red)
   TEST DECK TO DETERMINE MOST QUALIFIED RESPONSE FOR A HOUSEHOLD: ID1, ID2, ID3
   (highlighted in green, yellow, and blue)

MRR processing takes several steps to determine how household and person data should be
extracted from responses returned for a census housing unit ID. Decisions are based upon
values of variables such as the number of responses for a household, the number of people in a
household, the completeness of the 100 percent population data, whether the response is on a
long or short form, the source operation generating the response, and the date that the response
was received. To describe how decisions are to be made by the software, the specification sets
up hierarchical rules, where lower rules act as “tiebreakers” for cases where both returns rank the
same according to all higher rules. However, because the specification does not explicitly list a
representative set of cases of multiple responses, present the resolution logic, and specify the
output, it is difficult to determine what the correct resolution of multiple responses should be
under various conditions.
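
The tiebreaker structure described above can be sketched as an ordered list of rules applied
until a single response remains. This is a minimal illustration only: the rule order, field names,
and scoring below are hypothetical and are not taken from the MRR specification, which this
report notes does not present its resolution logic explicitly.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Response:
    """One questionnaire response for a housing unit ID (hypothetical fields)."""
    response_id: str
    person_count: int
    data_quality_index: float  # completeness of 100 percent data; the 0..1 scale is assumed
    is_long_form: bool
    received: date


# Hypothetical rule hierarchy, highest priority first. Each rule scores a
# response so that a larger value is preferred. The real ordering may differ.
RULES = [
    ("person count", lambda r: r.person_count),
    ("long form preferred", lambda r: int(r.is_long_form)),
    ("data completeness", lambda r: r.data_quality_index),
    ("earlier receipt date", lambda r: -r.received.toordinal()),
]


def select_response(responses):
    """Return (selected response, name of the rule that decided).

    Each rule keeps only the responses tied for the best score; lower rules
    act as tiebreakers. The rule name is None when no rule was needed or when
    the responses are still tied after every rule.
    """
    candidates = list(responses)
    if not candidates:
        raise ValueError("at least one response is required")
    if len(candidates) == 1:
        return candidates[0], None
    for name, score in RULES:
        best = max(score(r) for r in candidates)
        candidates = [r for r in candidates if score(r) == best]
        if len(candidates) == 1:
            return candidates[0], name
    return candidates[0], None  # unresolved tie: falls back to the first listed


# Made-up responses for one housing unit ID.
a = Response("A", 4, 0.75, False, date(1998, 4, 20))
b = Response("B", 4, 0.90, False, date(1998, 4, 25))
c = Response("C", 5, 0.10, False, date(1998, 5, 1))

winner_ab, rule_ab = select_response([a, b])
winner_abc, rule_abc = select_response([a, b, c])
print(winner_ab.response_id, rule_ab)
print(winner_abc.response_id, rule_abc)
```

Writing the hierarchy as data makes the order of the rules explicit and reviewable, and returning
the deciding rule gives testers a concrete expected output for each case of multiple responses.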
For example, the specification does not make clear:

•      When two person records on different returns are compared to see if they belong to the
       same household, whether a person record that has already been marked for deletion
       should still be used in the comparison.

•      When a person entry on a short form response contains more complete data than the
       corresponding entry on a long form response, how that data is consolidated into the long
       form response.

•      When there are more than two responses for a census housing unit ID, which pairs of
       responses are compared and how inconsistent linking results are reconciled.

Additionally, the specification cites the 100% Data Quality Index, which measures the
completeness of the population data for persons reported on the response, but the specification
does not explain how to calculate the index. Also, it is unclear why more complete data for a
household, as measured by the index, does not carry more weight than its relatively low position
in the rule hierarchy and processing steps currently gives it. Finally, the specification does not
define contingencies in processing data that does not conform to items listed in the section titled
“Operational Constraints and Assumptions.” Omissions such as those cited here leave the
specification incomplete for purposes of software development. A more comprehensive
specification written according to the “Software Requirements Definition and Analysis” chapter
in the bureau’s life cycle manual would help resolve these issues, as well as facilitate testing.

In addition to its current efforts to develop rules and specifications for processing incomplete
data, the bureau needs to implement error handling procedures. Robust software is designed to
handle input conditions consisting of data values that fall outside permitted values.
These error
handling procedures may attempt to fix the data to conform to permitted range values when
feasible. When data is recognized as erroneous, the software may also report these conditions for
further analysis.

Bureau standards that were cited earlier in this report (section II.A) offer some guidance about
specifying and testing erroneous data conditions:

       •       The Census Software Development Life Cycle: On pages 94 through 98 of this
               manual, the contents of the specification of a requirement are described.
               According to this manual, the processing section should include a description of
               “responses to abnormal situations.” The analyst could describe how to handle
               erroneous data (fix or omit the data and report the action) in this portion of the
               specification.

       •       Decennial System and Contract Management Office Decennial Processing
               Systems Software Development Process: Appendix B of this manual contains a
               description of various types of testing such as “equivalence partitioning” and
               “boundary condition testing.” The requirements specification would describe
               values or ranges of values that would be used for this testing.

C.     MRR Software Testing Needs to Be More Complete

The MRR team conducted testing from April through September 1998 in preparation for the
dress rehearsal. The testing began by breaking the specification’s narrative description into
individual requirements and documenting this information in a test plan.
The test plan describes\nthe requirements to be tested, the test cases and associated procedures, and the test data\n(referred to as test decks) for executing the procedures.\n\nThe test plan identifies 59 testable requirements. In addition, the plan describes test procedures\nto cover 43 test cases, where each procedure addresses one or more requirements. The test plan\nalso provides guidance for the development of test decks, which are intended to satisfy the\nconditions of one or more of the test cases. It also includes a section in which results from\nrunning the software with each test deck are recorded with an indication of success or failure,\nalong with an explanation of the nature of any failures. The final test plan, which indicated\nsuccessful processing of each test deck, was released on September 11, 1998.\n\nThe test plan served as the basis for a commendable testing effort that was performed with great\ncare under severe time constraints. The test decks developed by the test team afforded valuable\ninput data for the developer to correct software errors in preparation for testing with dress\nrehearsal data when it became available. A more complete set of test cases would further\nenhance this testing effort. In Table 2, entries highlighted in red denote conditions of form type\nand source operation that were not tested. For instance, the operation of coverage edit follow-up\nwas not tested for either the short or long form. Additionally, the operations update/leave and\nupdate/leave ADD were not tested for the short form, and so on. In all, 14 operation-form type\ncombinations out of a possible 30 were not tested. More tests addressing these omissions and\nensuring that the software handles the test data as intended by the requirements are needed to\nverify the correctness of the software.\n\n
As a further example, the test deck designed to test the choice of the most qualified response for\na household could include many more cases of multiple responses. This test deck uses the data\nfor three census housing unit IDs. The instances of form type and operation included are\nhighlighted in Table 2 in green, yellow, and blue; i.e., there are two instances each for ID1 and\nID2 and four instances for ID3. We suggest that multiple responses representing several more\ncombinations of form type and operation be tested to verify that the software chooses the most\nqualified response for a household.\n\nD.      Beginning-to-End Testing Is Needed to Ensure Consistent Results\n\nIn addition, beginning-to-end testing should be conducted with data containing comprehensive\nmultiple response instances for the questionnaire data path from DCS 2000 to creation of the\nCUF. Beginning-to-end testing starts with scanned images of the completed forms and proceeds\nwith formatting the data contained in DRF1. The next step is the linking of continuation forms\nand other processing that produce DRF2. The results of MRR processing of DRF2 data should\nbe analyzed for correctness. This testing is particularly important in ensuring that the results of\nthe matcher software are correct with respect to captured data and would check the correctness of\nall pre-edit files.26 If the quantity and variety of data are large enough, this effort will also stress\ntest the software.27\n\nEnabling headquarters processing personnel to view the scanned questionnaires\xe2\x80\x99 image data\nwould provide a further opportunity for quality assurance, and hence improved accuracy of the\nscanning procedures. 
The results of beginning-to-end testing should be used, as appropriate, to\nadjust matcher parameters to fine-tune the scoring process or, if necessary, to modify matcher\nsoftware. Beginning-to-end testing should be repeated as modifications to DCS 2000 affect the\ncaptured data.\n\n        26   Pre-edit files contain all questionnaire data. Edited files contain subsets of the questionnaire data.\n        27   Stress testing is designed to overload the system in various ways.\n\nThe bureau has concurred with our recommendation for beginning-to-end testing and stated that\nthe Decennial Integrated System Test (DIST) is designed to fulfill this recommendation.\nHowever, this testing, although valuable and needed, will start in October 1999, which is\nconsiderably earlier than MRR\xe2\x80\x99s development and testing time frame. In addition,\ndocumentation indicates that the DIST will treat headquarters processing of response data as a\nsingle entity. The limited amount of input data planned for the DIST may not sufficiently test the\nconditions that MRR is specified to handle. To meet this recommendation fully, the bureau\nneeds to conduct beginning-to-end testing with completed questionnaires that reflect a sufficient\nvariety of multiple response scenarios and submit the questionnaires to the processing stream,\nstarting with DCS 2000 and continuing with headquarters processing, including MRR. This testing\nwould not treat headquarters processing as a single entity but would test each component of\nheadquarters processing with testing tools that allow analysts and developers to monitor\nprocessing and isolate software errors. 
By using these tools, bureau personnel would be able to\ntrack processing decisions within each step, as well as to review the results of each step to find\nerrors.\n\n\nIII.   Improvements Are Needed to Make MRR Processing Timely for the 2000 Census\n\nThe Census Bureau has projected that MRR processing for the decennial census will need to be\ncompleted within 30 days to ensure timely availability of data for follow-on processing\noperations. Results for dress rehearsal indicate that unless improvements are made, MRR\nprocessing for the decennial census will require approximately 87 days. If MRR processing\ncannot be completed within the required time constraints using available computing resources,\nfollow-on operations will be impacted and census results will be late. Therefore, the bureau\nneeds to streamline MRR software design and data organization, upgrade computer resources, or\ninstitute some combination of streamlining and upgrading to ensure timeliness of MRR results.\n\nTo assess the efficiency of MRR software design and data organization, as well as the adequacy\nof the computer systems planned to run MRR software for the decennial census, we reviewed\nMRR processing times for the three dress rehearsal sites. Table 3 shows the total elapsed time\nrequired on a dedicated computer system to execute the MRR software for each dress rehearsal\nsite. The bureau planned to use the same headquarters computer system for MRR processing\nduring the decennial census. As noted, the bureau has allotted 30 days for MRR processing\nduring the decennial census. The total time to perform MRR processing for the approximately\n400,000 households in the dress rehearsal was 6.95 hours. Under the same conditions, it would\ntake roughly 300 times as long\xe2\x80\x942,085 hours or 87 days\xe2\x80\x94to perform the MRR processing for the\n120 million households expected in the decennial census. The bureau clearly will not have\nnearly this much time available. 
Even if the bureau were to eliminate the second mailing of\nquestionnaires, the time required to perform MRR processing would not be significantly affected\nsince fewer than 6 percent of households responded to both the initial and second mailings for\nthe dress rehearsal.\n\nThe MRR team has identified several ways to reduce the processing time. For example,\ndevelopers are considering using binary files, rather than ASCII files, to represent DRF2, the data\nfile which serves as input to the WBS and PSA software.28, 29 In general, binary files provide a\nmore compact representation of numeric data (e.g., age, number of persons in a household, year\nof birth) than ASCII files. Additionally, binary files provide computational efficiency over\nASCII files in that numeric data does not need to be converted from its ASCII representation\nbefore computations can be performed using data read from the file. 
Developers have also found\nthat using vendor-provided file compression software reduces both processing times and the\namount of disk storage used.\n\n\n                                             Table 3\n                              MRR Processing Times in Dress Rehearsal\n\n                                   Menominee         Columbia        Sacramento       Total (all three sites)\n WBS processing                           0.0190         3.1830            1.9405                       5.142\n PSA processing                           0.0165         1.1365            0.6550                       1.808\n Total                                    0.0355         4.3190            2.5950                       6.950\n Table entries represent processing time in hours.\n\n\n\nThe bureau has decided not to search for matches for any persons outside the census housing unit\nID for which they are reported, which is a function performed by WBS for the dress rehearsal. Based\non the evaluation studies for this element, bureau personnel have observed that few matches were\nfound and that the block-wide search conducted for the dress rehearsal accounts for 70 percent of all\nthe processing time for MRR. Therefore, simplifying the design by eliminating searching for\nperson matches outside the census housing unit ID would realize considerable processing-time\nsavings. If the bureau determines that the trend observed in the dress rehearsal will likely be the\nsame for the census, then this would be a valid design change.\n\nBureau managers and staff have expressed concern about the adequacy of headquarters\ncomputing resources to support not only MRR but other decennial census operations as well. 
As a result,\nwe believe that the MRR team should continue to pursue ways to streamline MRR software\ndesign and data organization and that bureau managers should upgrade headquarters computer\nresources to ensure timely processing for MRR and other required operations for the decennial\ncensus.\n\n         28   Binary pertains to a number system that has just two unique digits, 0 and 1. Computers are based on the\n              binary number system.\n         29   ASCII is the acronym for the American Standard Code for Information Interchange. ASCII is a code for\n              representing English characters as numbers, with each character assigned a number from 0 to 127. Most computers\n              use ASCII codes to represent text, and files used to store ASCII coded text are commonly referred to as ASCII files.\n\n\n                                        CONCLUSION\n\nAs a result of the foregoing issues, the bureau must make important decisions concerning MRR\nrequirements and design and perform significant development and testing work. The bureau has\nrecognized many of the problems that we have discussed and is performing analysis of the dress\nrehearsal and evaluation data and working on improvements to MRR, as well as automated data\ncapture. However, the bureau has not fully defined the activities needed to refine MRR and\ncomplete its development. 
Because of the complexity and amount of work remaining to be done\nin time for the decennial census, the bureau needs to define the remaining tasks and plan carefully for\ntheir completion.\n\nThe bureau needs to produce a rigorous and complete MRR requirements specification and\nshould use its software engineering standard, The Census Software Development Life Cycle\nmanual, to do so. The bureau also needs to develop a comprehensive test plan following the same\ntemplate used for the dress rehearsal to ensure that the software correctly implements the MRR\nrequirements. Finally, the bureau needs to complete software development and testing in\naccordance with the specification and test plan. To accomplish these tasks successfully, the\nbureau should prepare a plan that identifies all of the decisions to be made and work to be\nperformed, along with the responsible organizations, a realistic schedule, and a complete list of\nmilestones.\n\nMRR is only one of many processing elements that manipulate questionnaire data from the\npublic and are used to create the final decennial tabulations. The same structured software\ndevelopment process that is needed for MRR, including clear and complete requirements and\nthorough testing, is essential for all software. Ensuring the correctness of the software also\nrequires beginning-to-end testing of all processing elements. Use of more rigorous software\ndevelopment methods and sharing data among processing elements in a coordinated fashion will\nhelp the bureau secure accurate results for the 2000 Decennial Census.\n\n\n
                                  RECOMMENDATIONS\n\nWe recommend that the Director of the Census Bureau direct senior management for the 2000\nDecennial Census to take the necessary actions to:\n\n1.     Establish an appropriate standard for measuring the success of MRR processing.\n\n       The bureau concurred with this recommendation.\n\n2.     Identify both the accuracy of MRR in resolving multiple responses and the impact\n       of MRR errors on the total enumeration when reporting on the success of MRR.\n\n       The bureau concurred with this recommendation.\n\n3.     Define and implement \xe2\x80\x9cintelligent keying\xe2\x80\x9d guidelines to process data sent to the KFI\n       component of DCS 2000.\n\n       The bureau stated that it has developed and is reviewing such guidelines and will report\n       to the OIG as soon as a decision is made.\n\n4.     Develop a rigorous and complete revision of the MRR software requirements\n       specification by:\n\n       a.     Obtaining assistance from the Office of the Associate Director for Information\n              Technology in using Chapter 3 of the Census Software Development Life Cycle\n              manual to develop a revised software requirements specification.\n\n       b.     Specifying a revised set of software requirements necessary to implement PSA\n              with the highest probability of resolving multiple responses accurately as\n              determined by the MRR team\xe2\x80\x99s analysis of the evaluation data.\n\n       c.     Specifying a revised set of software requirements for use by the matcher by\n              creating a team of DCS 2000 contractor and bureau analysts to analyze data\n              produced by DCS 2000 and corresponding matcher output.\n\n       d.     
Specifying the software requirements necessary to address data quality issues,\n              such as how to handle incomplete, inconsistent, or erroneous data.\n\n       The bureau concurred with Recommendation Nos. 4a through 4c and requested\n       clarification on No. 4d, which is provided on pages 18-19.\n\n5.     Plan, schedule, and implement all tasks needed to complete MRR definition,\n       development, and testing, including:\n\n       a.     Developing the revised MRR software requirements specification in accordance\n              with Recommendation No. 4.\n\n       b.     Evaluating approaches to streamline MRR design and data organization.\n\n       c.     Developing a test plan based upon the software requirements, using the same\n              template that the MRR team used for the dress rehearsal test plan.\n\n       d.     Developing the software in accordance with the specification.\n\n       e.     Testing the software in accordance with the test plan.\n\n       f.     Performing beginning-to-end testing of the questionnaire data processing stream\n              from DCS 2000 through MRR using available debugging tools.\n\n       The bureau concurred with Recommendation Nos. 5a through 5f. Further description of\n       beginning-to-end testing that would more fully satisfy Recommendation No. 5f is provided on\n       pages 20-21.\n\n6.     Assess whether selective use of clerical quality assurance procedures would be cost\n       effective as part of MRR.\n\n       The bureau stated that this recommendation has been resolved.\n\n7.     
Determine the additional headquarters computing resources needed to ensure timely\n       processing of multiple response resolution and all other required operations for the\n       decennial census, and obtain these resources.\n\n       The bureau stated that this recommendation has been resolved.\n\n8.     Revise enumerator training to emphasize the importance of obtaining complete and\n       accurate input data because poor response data precludes accurate processing.\n\n       The bureau stated that this recommendation has been resolved.\n\nThe bureau\xe2\x80\x99s complete response is included as the appendix of this report.\n'